On Hidden Markov Processes with Infinite Excess Entropy
We investigate stationary hidden Markov processes for which mutual
information between the past and the future is infinite. It is assumed that the
number of observable states is finite and the number of hidden states is
countably infinite. Under this assumption, we show that the block mutual
information of a hidden Markov process is upper bounded by a power law
determined by the tail index of the hidden state distribution. Moreover, we
exhibit three examples of processes. The first example, considered previously,
is nonergodic and the mutual information between the blocks is bounded by the
logarithm of the block length. The second example is also nonergodic but the
mutual information between the blocks obeys a power law. The third example
obeys the power law and is ergodic.
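As orientation for the abstracts in this group, the quantity whose growth is being bounded is the mutual information between past and future blocks; the notation below is not taken from the abstract but is the standard setup assumed here for illustration:

\[
  E(n) \;=\; I\!\left(X_{-n+1}^{0};\, X_{1}^{n}\right), \qquad
  E \;=\; \lim_{n\to\infty} E(n) \quad \text{(excess entropy)},
\]

so infinite excess entropy means that $E(n)$ diverges, and the bound described above has the form $E(n) \le C\, n^{\beta}$ for some constant $C$ and an exponent $\beta$ determined by the tail index of the hidden-state distribution.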
Mixing, Ergodic, and Nonergodic Processes with Rapidly Growing Information between Blocks
We construct mixing processes over an infinite alphabet and ergodic processes
over a finite alphabet for which Shannon mutual information between adjacent
blocks of length $n$ grows as $n^{\beta}$, where $\beta \in (0,1)$. The processes
are a modification of nonergodic Santa Fe processes, which were introduced in
the context of natural language modeling. The rates of mutual information for
the latter processes are similar and are also established in this paper. As an
auxiliary result, it is shown that infinite direct products of mixing processes
are also mixing.
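The Santa Fe processes mentioned above can be pictured concretely. Below is a minimal Python sketch, assuming the construction usually used in this line of work: each symbol is a pair X_i = (K_i, Z_{K_i}), where the indices K_i are drawn IID from a power-law distribution and the bits Z_k are random but frozen for the whole realization. The parameter names, the exponent, and the truncation of the index range are illustrative choices made here, not values taken from the papers.

import random
from itertools import accumulate

def santa_fe_sample(n, beta=0.5, max_index=100_000, seed=0):
    """Draw one realization of length n from a Santa Fe-like process.

    Each symbol is a pair X_i = (K_i, Z_{K_i}): the indices K_i are IID
    with P(K_i = k) proportional to k ** (-1 / beta), and each bit Z_k is
    a fair coin flip that stays fixed for the whole realization.  The
    truncation at max_index keeps the simulation finite; beta, max_index,
    and seed are assumptions made for this sketch.
    """
    rng = random.Random(seed)
    weights = [k ** (-1.0 / beta) for k in range(1, max_index + 1)]
    cum = list(accumulate(weights))
    facts = {}  # the frozen bits Z_k, generated lazily on first use
    out = []
    for _ in range(n):
        k = rng.choices(range(1, max_index + 1), cum_weights=cum)[0]
        if k not in facts:
            facts[k] = rng.randint(0, 1)
        out.append((k, facts[k]))
    return out

if __name__ == "__main__":
    print(santa_fe_sample(10))

Intuitively, because the bits Z_k describe the same random "facts" at every position, two different realizations typically disagree on infinitely many of them, which is the source of both the nonergodicity of the plain Santa Fe process and the slowly decaying dependence between distant blocks.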
On the Vocabulary of Grammar-Based Codes and the Logical Consistency of Texts
The article presents a new interpretation for Zipf-Mandelbrot's law in
natural language which rests on two areas of information theory. Firstly, we
construct a new class of grammar-based codes and, secondly, we investigate
properties of strongly nonergodic stationary processes. The motivation for the
joint discussion is to prove a proposition with a simple informal statement: If
a text of length $n$ describes $n^{\beta}$ independent facts in a repetitive way,
then the text contains at least $n^{\beta}/\log n$ different words, under
suitable conditions on $n$. In the formal statement, two modeling postulates
are adopted. Firstly, the words are understood as nonterminal symbols of the
shortest grammar-based encoding of the text. Secondly, the text is assumed to
be emitted by a finite-energy strongly nonergodic source whereas the facts are
binary IID variables predictable in a shift-invariant way.
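For background only (the abstract refers to it by name but does not restate it), Zipf-Mandelbrot's law is the empirical rank-frequency relation

\[ f(r) \;\propto\; (r + b)^{-a}, \]

where $f(r)$ is the frequency of the $r$-th most frequent word and $a$ (typically close to 1 for word frequencies) and $b > 0$ are fitted constants; a vocabulary growing as a power of the text length, as in the proposition above, is the kind of behaviour such a rank-frequency law is associated with.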
Variable-Length Coding of Two-Sided Asymptotically Mean Stationary Measures
We collect several observations that concern variable-length coding of
two-sided infinite sequences in a probabilistic setting. Attention is paid to
images and preimages of asymptotically mean stationary measures defined on
subsets of these sequences. We point out sufficient conditions under which the
variable-length coding and its inverse preserve asymptotic mean stationarity.
Moreover, conditions for preservation of shift-invariant $\sigma$-fields and
the finite-energy property are discussed and the block entropies for stationary
means of coded processes are related in some cases. Subsequently, we apply
certain of these results to construct a stationary nonergodic process with a
desired linguistic interpretation.
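The key property named in this abstract, asymptotic mean stationarity, is standard but not defined there; the definition assumed here (following Gray and Kieffer) is that a measure $\mu$ on two-sided sequences with shift $T$ is asymptotically mean stationary if the Cesàro limit

\[ \bar{\mu}(A) \;=\; \lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \mu\!\left(T^{-k} A\right) \]

exists for every measurable set $A$; the resulting stationary measure $\bar{\mu}$ is the stationary mean whose block entropies the abstract refers to.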
Universal Coding and Prediction on Martin-Löf Random Points
We perform an effectivization of classical results concerning universal
coding and prediction for stationary ergodic processes over an arbitrary finite
alphabet. That is, we lift the well-known almost sure statements to statements
about Martin-Löf random sequences. Most of this work is quite mechanical but,
along the way, we complete a result of Ryabko from 2008 by showing that each
universal probability measure in the sense of universal coding induces a
universal predictor in the prequential sense. Surprisingly, the effectivization
of this implication holds true provided the universal measure does not assign
excessively low conditional probabilities to individual symbols. As an example, we show
that the Prediction by Partial Matching (PPM) measure satisfies this
requirement. In the almost sure setting, the requirement is superfluous.
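To fix the terminology the abstract relies on: in one standard formulation, a measure $R$ over infinite sequences is universal in the sense of universal coding if, for every stationary ergodic measure $P$ over the same finite alphabet with entropy rate $h_P$,

\[ \lim_{n \to \infty} -\frac{1}{n} \log R(X_1^n) \;=\; h_P \qquad P\text{-almost surely}, \]

that is, coding with $R$ asymptotically attains the optimal compression rate. Universality in the prequential sense concerns instead the conditional probabilities $R(x_{n+1} \mid x_1^n)$ used as predictions, which is where the lower bound on conditional probabilities mentioned above enters.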